Add Doppler ICP algorithm in registration pipeline and dopplers field in PointCloud #5220

Closed
wants to merge 9 commits into from


heethesh
Contributor

@heethesh heethesh commented Jun 20, 2022

We would like to contribute the implementation of our Doppler ICP algorithm for point clouds captured by FMCW LiDARs. This is the implementation of the following paper:

@INPROCEEDINGS{Hexsel-RSS-22, 
    AUTHOR    = {Bruno Hexsel AND Heethesh Vhavle AND Yi Chen}, 
    TITLE     = {{DICP: Doppler Iterative Closest Point Algorithm}}, 
    BOOKTITLE = {Proceedings of Robotics: Science and Systems}, 
    YEAR      = {2022}, 
    ADDRESS   = {New York City, NY, USA}, 
    MONTH     = {June}, 
    DOI       = {10.15607/RSS.2022.XVIII.015} 
}
  • Added new DopplerICP algorithm in open3d::pipelines::registration.
  • C++ and Python examples for Doppler ICP.
  • Added dopplers field in open3d::geometry::PointCloud class.
  • Added converged and num_iterations in RegistrationResult.


@ssheorey ssheorey requested a review from yxlao June 20, 2022 04:04
@yxlao yxlao requested a review from reyanshsolis June 21, 2022 07:39
Collaborator

@yxlao yxlao left a comment

Reviewed 13 of 21 files at r1, 7 of 8 files at r2, all commit messages.
Reviewable status: 20 of 21 files reviewed, 5 unresolved discussions (waiting on @heethesh and @reyanshsolis)


cpp/open3d/geometry/PointCloud.h line 451 at r2 (raw file):

    std::vector<Eigen::Vector3d> colors_;
    /// Doppler velocity of points.
    std::vector<double> dopplers_;

Adding a new hard-coded field to the geometry::PointCloud class is not ideal, as this field is only used by the DICP algorithm. The new t::geometry::PointCloud handles this problem, since we can attach arbitrary properties (as tensors) to the class. I recommend moving the implementation to the t::geometry::PointCloud class. The same goes for the pipeline: we should move it to t::pipelines.

You may check out cpp/open3d/t/pipelines/registration/Registration.h for reference. For simplicity, we may focus on the CPU implementation for now. @reyanshsolis can answer any questions that you may have regarding the tensor-based registration.
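
To illustrate the design difference being discussed, here is a minimal Python sketch of a point cloud that keeps every per-point property in a name-keyed map instead of in hard-coded members. `AttributePointCloud`, `set_attribute`, and `has_attribute` are hypothetical names used only to model the idea; they are not the t::geometry::PointCloud API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class AttributePointCloud:
    """Toy point cloud storing every per-point property in a name-keyed map."""

    attributes: Dict[str, List[Sequence[float]]] = field(default_factory=dict)

    def set_attribute(self, name: str, values: List[Sequence[float]]) -> None:
        # Any attribute other than "positions" must have one entry per point.
        if name != "positions" and "positions" in self.attributes:
            if len(values) != len(self.attributes["positions"]):
                raise ValueError(f"'{name}' must have one entry per point")
        self.attributes[name] = values

    def has_attribute(self, name: str) -> bool:
        return name in self.attributes


cloud = AttributePointCloud()
cloud.set_attribute("positions", [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)])
# An algorithm-specific property is attached without changing the class.
cloud.set_attribute("dopplers", [(-0.5,), (0.1,)])
```

With this kind of layout, an algorithm that needs Doppler velocities can check for the attribute at run time, and clouds that do not carry it pay no storage cost.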


cpp/open3d/io/PointCloudIO.h line 150 at r2 (raw file):

                          const WritePointCloudOption &params);

bool ReadPointCloudFromXYZD(const std::string &filename,

Similarly, these functions can be moved to t::io.


examples/cpp/RegistrationDopplerICP.cpp line 97 at r2 (raw file):

    std::shared_ptr<geometry::PointCloud> target =
            open3d::io::CreatePointCloudFromFile(argv[2]);
    if (source == nullptr || target == nullptr) {

It would be best to use one of the example datasets if the user does not specify a path, e.g. http://www.open3d.org/docs/latest/tutorial/data/index.html#demoicppointclouds


examples/cpp/RegistrationDopplerICP.cpp line 146 at r2 (raw file):

    return 0;
}

We also need to add a C++ unit test (cpp/tests/t/pipelines/registration/Registration.cpp) before migrating to the t:: namespace.

  1. Use the example dataset
  2. Run DICP for a fixed number of iterations, using fixed parameters
  3. Assert the output to be close to some numerical values

This unit test will make sure that, after your code is migrated to the t:: namespace, the numerical values remain the same.
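
The three steps above can be sketched as a regression check. The real test would be written in C++ against the example dataset; the Python below is only a toy stand-in, and `toy_icp_translation` is a hypothetical translation-only point-to-point ICP, not DICP.

```python
def nearest(point, candidates):
    """Brute-force nearest neighbour in 2D."""
    return min(
        candidates,
        key=lambda q: (q[0] - point[0]) ** 2 + (q[1] - point[1]) ** 2,
    )


def toy_icp_translation(source, target, iterations):
    """Translation-only point-to-point ICP, a stand-in for the real algorithm."""
    tx, ty = 0.0, 0.0
    for _ in range(iterations):
        moved = [(x + tx, y + ty) for x, y in source]
        matches = [nearest(p, target) for p in moved]
        # The least-squares translation update is the mean residual.
        tx += sum(q[0] - p[0] for p, q in zip(moved, matches)) / len(moved)
        ty += sum(q[1] - p[1] for p, q in zip(moved, matches)) / len(moved)
    return tx, ty


# 1. Fixed input data (a small grid standing in for the example dataset).
target = [(float(i), float(j)) for i in range(4) for j in range(4)]
source = [(x - 0.2, y + 0.1) for x, y in target]

# 2. Fixed number of iterations and fixed parameters.
estimate = toy_icp_translation(source, target, iterations=5)

# 3. Assert that the output is close to known reference values.
expected = (0.2, -0.1)
assert all(abs(e - r) < 1e-9 for e, r in zip(estimate, expected))
```

Because the inputs, iteration count, and tolerance are all fixed, the same assertion can be rerun unchanged after a port to check that the numbers did not drift.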


examples/python/pipelines/doppler_icp.py line 55 at r2 (raw file):

    args = parser.parse_args()

    source = o3d.io.read_point_cloud(args.source)

Same here: it would be best to use one of the example datasets if the user does not specify a path, e.g. http://www.open3d.org/docs/latest/tutorial/data/index.html#demoicppointclouds
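
A minimal sketch of that fallback, assuming the script takes optional `--source` and `--target` flags. `default_dataset_paths` and the file names it returns are hypothetical placeholders for whatever bundled dataset helper the project provides.

```python
import argparse


def default_dataset_paths():
    """Hypothetical stand-in for a bundled example dataset loader."""
    return "example_source.pcd", "example_target.pcd"


def resolve_inputs(argv):
    """Return (source, target) paths, falling back to the example dataset."""
    parser = argparse.ArgumentParser(description="Doppler ICP example")
    parser.add_argument("--source", default=None, help="source point cloud")
    parser.add_argument("--target", default=None, help="target point cloud")
    args = parser.parse_args(argv)

    default_source, default_target = default_dataset_paths()
    source = args.source if args.source is not None else default_source
    target = args.target if args.target is not None else default_target
    return source, target


# No paths given: both inputs come from the example dataset.
print(resolve_inputs([]))
# Only the source is overridden by the user.
print(resolve_inputs(["--source", "my_scan.pcd"]))
```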

@heethesh
Contributor Author

@yxlao Agreed, my main concern was adding the dopplers field too. I already have a CPU+GPU tensor version implemented that does not need another hard-coded field. I will have to check with my team and get approval to open-source it.

Regarding the dataset for tests, the Doppler ICP algorithm requires point clouds from FMCW sensors with per-point velocities. We have a couple of sample datasets generated in CARLA here. I can pick a small subset from this for the examples. Where do I add the examples? Is that another repository you maintain or LFS?
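
For readers unfamiliar with per-point velocities, the following sketch shows the radial velocity (range rate) that a static point would present to a moving sensor located at the origin. This is plain kinematics under an assumed sign convention (negative means approaching); it is not the DICP residual, and the function name is hypothetical.

```python
import math


def radial_velocity_of_static_point(point, sensor_velocity):
    """Range rate seen by a sensor at the origin for a static point."""
    norm = math.sqrt(sum(c * c for c in point))
    if norm == 0.0:
        raise ValueError("point coincides with the sensor origin")
    direction = [c / norm for c in point]
    # For a static point p and sensor position s(t):
    # d/dt |p - s(t)| = -direction . sensor_velocity
    return -sum(d * v for d, v in zip(direction, sensor_velocity))


# Sensor driving along +x at 10 m/s.
velocity = (10.0, 0.0, 0.0)
ahead = radial_velocity_of_static_point((5.0, 0.0, 0.0), velocity)
beside = radial_velocity_of_static_point((0.0, 5.0, 0.0), velocity)
```

A point straight ahead closes at the full sensor speed, while a point directly to the side has no radial component, which is why ordinary point clouds without this channel cannot be used to exercise the algorithm.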

@heethesh
Contributor Author

heethesh commented Jun 21, 2022

In practice, I found that the runtime of the legacy registration pipeline (with OpenMP) was the best, followed by tensor GPU, with the CPU tensor version being extremely slow. Do you have any plans to support additional variable/non-hard-coded fields (keeping points, colors, and normals as they are) in the legacy PointCloud class?

@reyanshsolis
Collaborator

Practically, I found that the runtime with legacy registration pipeline (with OpenMP) was the best, followed by tensor GPU and the CPU tensor version being extremely slow. Do you have any plans to support additional variable/non-hard-coded fields (keeping points, colors, and normals as it is) with the legacy PointCloud class?

The tensor-based implementation performs better than legacy, but it is a bit more sensitive to the search radius parameters. So, given well-tuned parameters: Tensor GPU >> Eigen/Legacy CPU ~ Tensor CPU.

The tensor-based modules are the new modules, and we intend to keep improving their performance, whereas the legacy modules might get deprecated in the future.

I can help you migrate the same quickly to the tensor-based pipeline.

@yxlao
Collaborator

yxlao commented Jun 21, 2022

followed by tensor GPU and the CPU tensor version being extremely slow. Do you have any plans to support additional variable/non hard-coded fields (keeping points, colors, and normals as it is) with the legacy PointCloud class?

The most likely reason lies in the implementation. If we use a for loop with repetitive tensor slicing and indexing, it will be slow. For best performance, we should use Tensors:

  1. as a buffer with pointer access to the values, this means writing custom kernels
  2. tensor ops (e.g. Tensor::Add(), Mul(), ...) with the vectorized implementation

In the following benchmark, the tensor-based ICP is faster than legacy:

$ make benchmarks -j && ./bin/benchmarks --benchmark_filter=".*ICP.*"

-------------------------------------------------------------------------------------
Benchmark                                           Time             CPU   Iterations
-------------------------------------------------------------------------------------
BenchmarkICPLegacy/PointToPlane / CPU            44.4 ms         40.8 ms           20
BenchmarkICPLegacy/PointToPoint / CPU             104 ms          104 ms            8
BenchmarkICP/"CPU:0" PointToPoint_Float32        31.4 ms         31.2 ms           23
BenchmarkICP/"CPU:0" PointToPoint_Float64        34.2 ms         31.1 ms           22
BenchmarkICP/"CPU:0" PointToPlane_Float32        33.4 ms         33.2 ms           22
BenchmarkICP/"CPU:0" PointToPlane_Float64        33.2 ms         33.1 ms           22
BenchmarkICP/"CPU:0" ColoredICP_Float32          39.9 ms         39.9 ms           16
BenchmarkICP/"CPU:0" ColoredICP_Float64          82.4 ms         69.3 ms           12
BenchmarkICP/"CUDA:0" PointToPoint_Float32       15.0 ms         15.0 ms           41
BenchmarkICP/"CUDA:0" PointToPoint_Float64       24.1 ms         24.1 ms           28
BenchmarkICP/"CUDA:0" PointToPlane_Float32       9.07 ms         9.07 ms           88
BenchmarkICP/"CUDA:0" PointToPlane_Float64       15.4 ms         15.4 ms           46
BenchmarkICP/"CUDA:0" ColoredICP_Float32         11.3 ms         11.3 ms           62
BenchmarkICP/"CUDA:0" ColoredICP_Float64         20.4 ms         20.4 ms           34

@heethesh
Contributor Author

Okay, let me create a PR with the kernels I implemented in the tensor pipeline for your review.

@reyanshsolis
Collaborator

@heethesh if you want, we can have a Zoom meeting too, so that I can help you migrate the kernels easily.

@heethesh
Contributor Author

@reyanshsolis I already have a working implementation of the CPU/GPU kernels in the tensor pipeline. I'll try to get the PR for it up soon, and we can probably set up a call to help optimize it after a first review.

@reyanshsolis
Collaborator

reyanshsolis commented Jun 21, 2022

@heethesh
There was some difference in the benchmark parameters earlier. Even after fixing that in #5230, tensor ICP shows better results than legacy.

The tensor-based implementation uses a radius search and then applies the distance threshold, whereas legacy uses a knn search. This makes the tensor version highly sensitive to the radius search parameter (the max_correspondence_distance threshold).
So with poorly chosen parameters, legacy performance might stay unaffected, whereas tensor-based performance will be bad.
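
One way to picture this sensitivity is a toy brute-force model in which a radius query has to gather every neighbour inside the radius before choosing the closest, while a knn query returns a single neighbour that is then thresholded. The sketch below assumes this simplified model; `knn_match` and `radius_match` are hypothetical helpers and do not reproduce Open3D's search code.

```python
import math


def knn_match(query, points, max_distance):
    """Nearest neighbour first, then reject it if it is too far away."""
    best = min(points, key=lambda p: math.dist(query, p))
    return best if math.dist(query, best) <= max_distance else None


def radius_match(query, points, radius):
    """Collect every neighbour within the radius, then keep the closest."""
    candidates = [p for p in points if math.dist(query, p) <= radius]
    if not candidates:
        return None, 0
    best = min(candidates, key=lambda p: math.dist(query, p))
    return best, len(candidates)


# Grid with 0.02 spacing, mimicking a cloud downsampled at that voxel size.
points = [(i * 0.02, j * 0.02) for i in range(-10, 11) for j in range(-10, 11)]
query = (0.003, 0.004)

match_small, n_small = radius_match(query, points, radius=0.05)
match_large, n_large = radius_match(query, points, radius=0.1)
match_knn = knn_match(query, points, max_distance=0.05)
```

In this model all three queries return the same correspondence, but the number of candidates the radius query touches grows with the radius, whereas the knn result is independent of the threshold until the match is rejected.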

Voxel size 0.02

max_correspondence_distance 0.05 (generally it should be around 1.4x voxel size)

-------------------------------------------------------------------------------------
Benchmark                                           Time             CPU   Iterations
-------------------------------------------------------------------------------------
BenchmarkICPLegacy/PointToPlane / CPU            66.5 ms         66.5 ms           11
BenchmarkICPLegacy/PointToPoint / CPU            82.1 ms         82.1 ms            9
BenchmarkICP/"CPU:0" PointToPoint_Float32        66.3 ms         65.7 ms           10
BenchmarkICP/"CPU:0" PointToPoint_Float64        61.1 ms         60.6 ms           12
BenchmarkICP/"CPU:0" PointToPlane_Float32        67.5 ms         67.5 ms           10
BenchmarkICP/"CPU:0" PointToPlane_Float64        67.1 ms         67.1 ms           11
BenchmarkICP/"CPU:0" ColoredICP_Float32          91.4 ms         91.3 ms            8
BenchmarkICP/"CPU:0" ColoredICP_Float64           111 ms          110 ms            6
BenchmarkICP/"CUDA:0" PointToPoint_Float32       18.4 ms         18.4 ms           39
BenchmarkICP/"CUDA:0" PointToPoint_Float64       35.4 ms         35.4 ms           19
BenchmarkICP/"CUDA:0" PointToPlane_Float32       9.87 ms         9.87 ms           76
BenchmarkICP/"CUDA:0" PointToPlane_Float64       27.3 ms         27.3 ms           26
BenchmarkICP/"CUDA:0" ColoredICP_Float32         13.7 ms         13.6 ms           42
BenchmarkICP/"CUDA:0" ColoredICP_Float64         33.7 ms         33.7 ms           21

max_correspondence_distance 0.1 (very poor init)

-------------------------------------------------------------------------------------
Benchmark                                           Time             CPU   Iterations
-------------------------------------------------------------------------------------
BenchmarkICPLegacy/PointToPlane / CPU            53.9 ms         53.9 ms           10
BenchmarkICPLegacy/PointToPoint / CPU            78.8 ms         78.8 ms           10
BenchmarkICP/"CPU:0" PointToPoint_Float32         200 ms          199 ms            3
BenchmarkICP/"CPU:0" PointToPoint_Float64         205 ms          200 ms            3
BenchmarkICP/"CPU:0" PointToPlane_Float32         184 ms          184 ms            4
BenchmarkICP/"CPU:0" PointToPlane_Float64         192 ms          187 ms            4
BenchmarkICP/"CPU:0" ColoredICP_Float32           251 ms          240 ms            3
BenchmarkICP/"CPU:0" ColoredICP_Float64           260 ms          259 ms            3
BenchmarkICP/"CUDA:0" PointToPoint_Float32       22.9 ms         22.9 ms           29
BenchmarkICP/"CUDA:0" PointToPoint_Float64       57.0 ms         57.0 ms           12
BenchmarkICP/"CUDA:0" PointToPlane_Float32       13.6 ms         13.6 ms           51
BenchmarkICP/"CUDA:0" PointToPlane_Float64       47.1 ms         47.1 ms           15
BenchmarkICP/"CUDA:0" ColoredICP_Float32         17.3 ms         17.3 ms           37
BenchmarkICP/"CUDA:0" ColoredICP_Float64         61.0 ms         60.9 ms           12

So, given parameters in an acceptable range, the tensor-based API provides better performance, along with a lot of flexibility and control, such as multi-scale ICP, custom robust kernels for outlier rejection, and real-time updates of intermediate results through a lambda function.

@heethesh
Contributor Author

@reyanshsolis @yxlao please see my new PR #5237

@reyanshsolis
Collaborator

We appreciate the contribution and effort. However, we will be merging new features into the tensor-based pipeline in the interest of a stable tensor API, and deprecating the Eigen-based legacy API.
Therefore, closing this PR. Looking forward to #5237.
